220 research outputs found

    Fairway: A Way to Build Fair ML Software

    Machine learning software is increasingly being used to make decisions that affect people's lives. But sometimes the core part of this software (the learned model) behaves in a biased manner that gives undue advantages to a specific group of people (where those groups are determined by sex, race, etc.). This "algorithmic discrimination" in AI software systems has become a matter of serious concern in the machine learning and software engineering communities. Prior work has sought to find "algorithmic bias" or "ethical bias" in software systems. Once bias is detected in an AI software system, mitigating it is extremely important. In this work, we (a) explain how ground-truth bias in training data affects machine learning model fairness and how to find that bias in AI software, and (b) propose a method, Fairway, which combines pre-processing and in-processing approaches to remove ethical bias from training data and trained models. Our results show that we can find and mitigate bias in a learned model without much damage to the predictive performance of that model. We propose that (1) testing for bias and (2) bias mitigation should be a routine part of the machine learning software development life cycle. Fairway offers much support for these two purposes. Comment: ESEC/FSE'20: The 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

    Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search

    We present a framework for quantifying and mitigating algorithmic bias in mechanisms designed for ranking individuals, typically used as part of web-scale search and recommendation systems. We first propose complementary measures to quantify bias with respect to protected attributes such as gender and age. We then present algorithms for computing fairness-aware re-ranking of results. For a given search or recommendation task, our algorithms seek to achieve a desired distribution of top-ranked results with respect to one or more protected attributes. We show that such a framework can be tailored to achieve fairness criteria such as equality of opportunity and demographic parity, depending on the choice of the desired distribution. We evaluate the proposed algorithms via extensive simulations over different parameter choices, and study the effect of fairness-aware ranking on both bias and utility measures. We finally present the online A/B testing results from applying our framework to representative ranking in LinkedIn Talent Search, and discuss the lessons learned in practice. Our approach resulted in a tremendous improvement in the fairness metrics (a nearly threefold increase in the number of search queries with representative results) without affecting the business metrics, which paved the way for deployment to 100% of LinkedIn Recruiter users worldwide. Ours is the first large-scale deployed framework for ensuring fairness in the hiring domain, with potential positive impact for more than 630M LinkedIn members. Comment: This paper has been accepted for publication at ACM KDD 201
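
    The idea of re-ranking toward a desired distribution of top-ranked results can be illustrated with a minimal greedy sketch. This is not the authors' published algorithm: the function name `rerank`, the tuple layout `(id, group, score)`, and the floor/ceiling bounds used at each position are assumptions made for illustration only. It also assumes every candidate's group appears in `desired`.

```python
from math import ceil, floor


def rerank(candidates, desired, k):
    """Greedy re-ranking sketch (hypothetical helper, not the paper's code).

    candidates: list of (id, group, score) tuples
    desired:    dict mapping group -> target proportion in the top-k
    k:          number of positions to fill
    """
    # One queue per group, each sorted by descending score.
    queues = {g: [] for g in desired}
    for cand in sorted(candidates, key=lambda c: c[2], reverse=True):
        queues[cand[1]].append(cand)

    counts = {g: 0 for g in desired}
    ranked = []
    for pos in range(1, k + 1):
        available = [g for g in desired if queues[g]]
        if not available:
            break
        # Groups that are under their minimum share at this position.
        below_min = [g for g in available
                     if counts[g] < floor(desired[g] * pos)]
        # Groups that have not yet reached their maximum share.
        below_max = [g for g in available
                     if counts[g] < ceil(desired[g] * pos)]
        pool = below_min or below_max or available
        # Among eligible groups, take the best-scoring remaining candidate.
        best = max(pool, key=lambda g: queues[g][0][2])
        ranked.append(queues[best].pop(0))
        counts[best] += 1
    return ranked


candidates = [("a", "M", 0.9), ("b", "M", 0.8), ("c", "M", 0.7),
              ("d", "F", 0.6), ("e", "F", 0.5), ("f", "M", 0.4)]
top4 = rerank(candidates, {"M": 0.5, "F": 0.5}, k=4)
print([c[0] for c in top4])
```

    With an equal target split, the sketch interleaves the two groups instead of returning the four highest scores, which shows the bias/utility trade-off the abstract refers to.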

    Data-Centric Resource Management in Edge-Cloud Systems for the IoT

    A major challenge in emergent scenarios such as the Cloud-assisted Internet of Things is efficiently managing the resources involved in the system while meeting the requirements of applications. From the acquisition of physical data to its transformation into valuable services or information, several steps must be performed, involving the various players in such a complex ecosystem. Support for decentralized data processing on IoT devices and other devices near the edge of the network, in combination with the benefits of cloud technologies, has been identified as a promising approach to reduce communication overhead, thus reducing delay for time-sensitive IoT applications. The interplay of IoT, edge and cloud to achieve the final goal of producing useful information and value-added services for end users gives rise to a management problem that needs to be wisely tackled. The goal of this work is to propose a novel resource management framework for edge-cloud systems that supports heterogeneity of both devices and application requirements. The framework aims to promote the efficient usage of the system resources while leveraging Edge Computing features to meet the low-latency requirements of emergent IoT applications. The proposed framework encompasses (i) a lightweight and data-centric virtualization model for edge devices, (ii) a set of components responsible for the resource management and the provisioning of services from the virtualized edge-cloud resources

    Application of Ultrasonic Beam Modeling to Phased Array Testing of Complex Geometry Components

    For several years, the French Atomic Energy Commission (CEA) has developed phased array techniques to improve defect characterization and adaptability to various inspection configurations [1]. Such techniques make it possible to steer and focus the ultrasonic beam radiated by a transducer split into a set of individually addressed elements, using amplitude and delay laws. For most conventional systems, those delay laws are extracted from geometric ultrasonic paths between each element of the array and a geometric focus, providing beam-forming abilities [2] for simple geometry components (e.g. beam-steering over a plane specimen), whereas experimental delays can be supplied to the array at transmission and reception to optimally adapt the ultrasonic beam to the detected defect, in a so-called self-focusing process [3,4]. This method, relevant for complex materials or geometries leading to phase distortion or complex paths that cannot be predicted by simple geometrical calculations, obviously requires the existence of a reflector, and the ultrasonic beam radiated by the experimental delay law cannot be known. Therefore this technique is used to improve defect detection (optimal sensitivity) rather than defect characterization. To assess the inspection of complex geometry components with an adaptive system, the CEA has developed new modeling devoted to predicting the ultrasonic field radiated by arbitrary transducers through specimens of complex geometry and material [5]. The model makes it possible to compute optimized delay laws that preserve the characteristics of the beam through the complex surface, as well as the actual radiated field using those delays. This paper presents two applications of this model: the inspection of a misaligned specimen, and the inspection of an irregular surface
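
    The conventional case described above, where delay laws come from geometric paths between each element and a focal point, can be sketched as follows. This is only an illustration under simplifying assumptions (a linear array, a homogeneous medium with a single sound speed, straight-line paths); it is not the CEA model, and the function name `focusing_delays` and its parameters are hypothetical.

```python
import math


def focusing_delays(element_x, focus, speed):
    """Geometric focusing delays for a linear array (illustrative sketch).

    element_x: x positions of the array elements on the surface z = 0 (m)
    focus:     (x, z) coordinates of the focal point (m)
    speed:     assumed uniform sound speed in the medium (m/s)
    Returns one transmit delay per element, in seconds.
    """
    fx, fz = focus
    # Straight-line path length from each element to the focal point.
    paths = [math.hypot(fx - x, fz) for x in element_x]
    longest = max(paths)
    # Elements with shorter paths fire later so all wavefronts
    # arrive at the focal point at the same time.
    return [(longest - p) / speed for p in paths]


# Three elements, 1 mm pitch, focus 20 mm below the array centre.
delays = focusing_delays([-1e-3, 0.0, 1e-3], focus=(0.0, 20e-3), speed=5900.0)
print(delays)
```

    The centre element has the shortest path and so receives the largest delay. For the complex surfaces discussed in the abstract, the straight-line path assumption no longer holds, which is why a field model is needed there.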

    Penile Carcinoma: Risk Factors and Molecular Alterations

    Penile carcinoma is a rare cancer in men. Although the incidence of penile carcinoma is very low in Western countries, in some countries the incidence is significantly greater, with penile carcinoma accounting for ≤10% of all male malignancies. Greater insight has been gained in recent years as to its pathogenesis, the risk factors associated with its development, and the clinical and histological precursor lesions related to this disease. In this review, risk factors and conditions for penile carcinoma, molecular alterations in this type of cancer, histological types, and prognostic factors will be discussed in order to further our understanding of the biology and behavior of this cancer

    Reconciling Utility with Privacy in Genomics

    Direct-to-consumer genetic testing makes it possible for everyone to learn their genome sequences. In order to contribute to medical research, a growing number of people publish their genomic data on the Web, sometimes under their real identities. However, this is at odds not only with their own privacy but also with the privacy of their relatives. The genomes of relatives being highly correlated, some family members might be opposed to revealing any of the family's genomic data. In this paper, we study the trade-off between utility and privacy in genomics. We focus on the most relevant kind of variants, namely single nucleotide polymorphisms (SNPs). We take into account the fact that the SNPs of an individual contain information about the SNPs of his family members and that SNPs are correlated with each other. Furthermore, we assume that SNPs can have different utilities in medical research and different levels of sensitivity for individuals. We propose an obfuscation mechanism that enables the genomic data to be publicly available for research, while protecting the genomic privacy of the individuals in a family. Our genomic-privacy preserving mechanism relies upon combinatorial optimization and graphical models to optimize utility and meet privacy requirements. We also present an extension of the optimization algorithm to cope with the non-linear constraints induced by the correlations between SNPs. Our results on real data show that our proposed technique maximizes the utility for genomic research and satisfies family members' privacy constraints

    Development of the SIOPE DIPG network, registry and imaging repository : a collaborative effort to optimize research into a rare and lethal disease

    Diffuse intrinsic pontine glioma (DIPG) is a rare and deadly childhood malignancy. After 40 years of mostly single-center, often non-randomized trials with variable patient inclusions, there has been no improvement in survival. It is therefore time for international collaboration in DIPG research, to provide new hope for children, parents and medical professionals fighting DIPG. As a first step towards collaboration, in 2011, a network of biologists and clinicians working in the field of DIPG was established within the European Society for Paediatric Oncology (SIOPE) Brain Tumour Group: the SIOPE DIPG Network. By bringing together biomedical professionals and parents as patient representatives, several collaborative DIPG-related projects have been realized. With help from experts in information technology and from legal advisors, an international, web-based comprehensive database was developed, the SIOPE DIPG Registry and Imaging Repository, to centrally collect data of DIPG patients. As of April 2016, clinical data as well as MR scans of 694 patients have been entered into the SIOPE DIPG Registry/Imaging Repository. The median progression-free survival is 6.0 months (95% Confidence Interval (CI) 5.6-6.4 months) and the median overall survival is 11.0 months (95% CI 10.5-11.5 months). At two and five years post-diagnosis, 10% and 2% of patients are alive, respectively. The establishment of the SIOPE DIPG Network and SIOPE DIPG Registry marks a paradigm shift towards collaborative research into DIPG. This is seen as an essential first step towards understanding the disease, improving care and (ultimately) finding a cure for children with DIPG. Peer reviewed